1.0 Introduction

Concepts and intuitions from probability and statistics pervade many aspects of our "non-mathematical" lives. 
When we decide whether to pay more for a car that is reported to be "more reliable"; when we choose to 
safeguard our children by living in a "low-crime" neighborhood; when we change our diet and exercise habits 
to lower the risk of heart disease; when we accept or refuse a job with other options pending; all the above 
actions involve implicit probability judgments. Arguably, no other area of mathematics can be said to apply so 
widely. No other area is so potentially empowering - so useful in making decisions, both intimate and
professional, and affords us such opportunities for making sense of phenomena, both natural and social. 

Yet, as the research literature (discussed below) has made abundantly clear, students do not take great 
advantage of these powerful ways of making sense of the world. This paper seeks to give an account of this 
missed opportunity in terms of the framework of Connected Mathematics (Wilensky, 1993). The paper begins
with a sketch of background issues related to learning probability and statistics. The Connected Probability 
Project (Wilensky, 1993; 1995a; 1995b) is then described and situated in the framework of Connected 
Mathematics. The notion of "epistemological anxiety" as a primary obstacle to learning probability and 
statistics is articulated and developed. The paper then describes how the use of computational environments - 
specifically object-based parallel modeling languages - can allow learners to successfully address 
epistemological anxiety in the realm of probability and statistics. The remainder of the paper seeks to 
instantiate the theory and practice of Connected Mathematics in case studies concerning learning about the 
concept of normal distribution. These cases illustrate how traditional teaching of the concept of normal
distribution, which relies on formalism and macro-level summary statistics, leads to epistemological anxiety.
The cases also illustrate how Connected Mathematics engages issues of epistemological anxiety and how it 
fosters a deeper understanding of normal distributions. This understanding is advanced through learners 
modeling normal distributions as emergent phenomena. 

1.1 The Subject Learners are Required to Hate

People's attitudes towards probability and statistics can be summed up in the well-worn adage attributed to both
Twain and Disraeli: "There are three kinds of lies: lies, damn lies, and statistics" (in Tripp, 1970).

Probability & Statistics (henceforward, P&S) research "methods" courses have achieved notoriety in the
undergraduate and graduate curriculum as the bane of many a student. Students in the social sciences are
usually required to take a statistics course. Though such courses are intended to introduce them to concepts and 
methods which they can then employ in making sense of real data, they frequently report being able to work the 
textbook problems but having little idea of "what they mean," or how to apply them to novel situations (see, 
e.g., Phillips, 1988; {Rubin et al, 198x}). Students complain about being able to make several different 
arguments to solve a problem, all of which sound plausible yet yield different answers (Wilensky, 1993). How 
are they to choose among the plausible alternatives? 

This confusion and indecision lead to what I have termed "epistemological anxiety" (Wilensky, 1993) - a
feeling, often in the background, that one does not comprehend the meanings, purposes, source or legitimacy of 
the mathematical objects one is manipulating and using. 

1.2 Responding to a Literature of Innate Constraints

There has been considerable research (e.g., Gould, 1991; Konold, 1991; Phillips, 1988; Piaget, 1975; Tversky 
& Kahneman, 1974) documenting difficulties people have in learning, understanding and using concepts of 
probability and statistics. Much of this research locates the source of the problem in cognitive constraints of the 
mind of the learner (see e.g., Cohen, 1979; Edwards & von Winterfeldt, 1986; Evans, 1993; Gould, 1991; 
Kahneman & Tversky, 1973; 1982; Nisbett, 1980; Nisbett et al, 1983; Tversky & Kahneman, 1974; 1980; 
1983). 

In their now classic work, Tversky & Kahneman (1982) document the persistent errors and "misconceptions" 
that people make when making probabilistic "judgments under uncertainty". Among these errors are systematic 
misconceptions about probabilities of conjunctions, inverse probabilities, updating probabilities in the face of 
new evidence and "seeing" non-random patterns in random data. These systematic errors are repeatable and
don't seem to go away even when people have had significant training in probability. This contributes to a 
widespread belief that humans are incapable of thinking intuitively about probability. 

Tversky & Kahneman speculate as to the origin of these systematic biases in people's assessment of 
likelihoods. They theorize that people's mental resources are too limited to be able to generate probabilistically 
accurate judgments. Consequently, people are forced to fall back on computationally simpler "heuristics". In 
other words, we have been "hard-wired" not to be able to think about probability and must circumvent our natural
thinking processes in order to overcome this liability. 

The view that humans must circumvent natural thinking processes in the realm of probability has spawned a 
large literature and has become very influential. For the purposes of this exposition, let us call this view the 
"accommodationist" view. The effect of this accommodationist view on P&S instruction has been a deference 
to the mechanical manipulation of the tokens of formal notation as a guard against unreliable intuitions. This 
lesson for P&S instruction has reached the highest levels of education. In an introductory graduate course in 
probability and statistics at a major research university, the professor wrote down Bayes' theorem for
calculating inverse probabilities and then baldly announced:

"Don't even try to do inverse probabilities in your head. Always use Bayes formula. As Tversky and Kahneman
have shown, it is impossible for humans to get an intuitive feel for inverse probabilities" (in Wilensky, 1993).

This quote is a pedagogical embodiment of the accommodationist view.
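
The professor's advice can be set beside a small experiment. The following Python sketch is not part of the original study; it computes an inverse probability twice, once with Bayes' formula and once by simply counting simulated cases. The "condition", the "test" and all three rates are hypothetical values chosen only for illustration.

```python
import random


def bayes_posterior(prior, hit_rate, false_alarm_rate):
    """P(condition | positive result), computed with Bayes' formula."""
    p_positive = prior * hit_rate + (1 - prior) * false_alarm_rate
    return prior * hit_rate / p_positive


def counted_posterior(prior, hit_rate, false_alarm_rate, trials=200_000, seed=1):
    """Estimate the same quantity by counting simulated cases."""
    rng = random.Random(seed)
    positives = 0
    true_positives = 0
    for _ in range(trials):
        has_condition = rng.random() < prior
        p_positive = hit_rate if has_condition else false_alarm_rate
        if rng.random() < p_positive:
            positives += 1
            if has_condition:
                true_positives += 1
    return true_positives / positives


# Hypothetical rates (assumptions, not data from the paper).
PRIOR, HIT_RATE, FALSE_ALARM_RATE = 0.01, 0.90, 0.05

if __name__ == "__main__":
    print("formula:", round(bayes_posterior(PRIOR, HIT_RATE, FALSE_ALARM_RATE), 3))
    print("counted:", round(counted_posterior(PRIOR, HIT_RATE, FALSE_ALARM_RATE), 3))
```

With these assumed rates the formula gives roughly 0.154, and the counted estimate should land close to it. The counting version has no formula to memorize: the inverse probability is just the share of positive cases that turn out to have the condition.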

It is possible to begin with a more conservative set of assumptions than do the accommodationists. We could
assume that P&S is an area of mathematics not widely separated from the rest of the discipline. Furthermore,
we could posit a broadly developmental view of mathematical intuition which allows for conceptual change. 
We would then look elsewhere for the source of learners' difficulties with P&S. We would look at the learning 
methods, tools and cognitive technologies (Pea, 1987) available to the learner. Given the prevalence of this 
accommodationist view, it would be of great interest if it could be shown that a suitable learning environment
could help learners transform their probabilistic biases into effective intuitions and a solid conceptual 
understanding of P&S. This work is being undertaken through the Connected Probability project. 

2.0 Connected Probability 

The mission of the Connected Probability project (Wilensky, 1993; 1995a; 1995b) is to better understand the
source of learners' difficulties in P&S and to build learning environments that would foster the development of
intuitive conceptions of foundational concepts and a positive attitude towards the discipline. The Connected
Probability project was launched with the hypothesis that learners' difficulties with P&S are not primarily due
to hard-wired cognitive constraints. We conjecture that students' antipathy towards P&S is primarily due to
epistemological anxiety concerning the basic concepts of the discipline. We further locate the source of the 
anxiety in the lack of sufficiently powerful learning environments for P&S. The paucity of the learning 
environment stems, in great part, from the unavailability of powerful tools for experimentation and 
construction in probability. The anxiety is further reinforced by the teaching practices employed in mathematics 
classrooms and a "protective" culture which discourages revelation of mathematical conceptions and process 
and its consequences for mathematical discourse in the classroom. Given this "diagnosis", it follows that 
learners' difficulties in P&S can be effectively addressed without resorting to either rigidly formalized 
instruction or neural surgery. 

In the initial year of the project, seventeen in-depth interviews about P&S were conducted with learners aged
fourteen to sixty-four. Interviews were open-ended and most often experienced by the interviewees as extended
conversations. The interviewer guided these conversations so that the majority of a list of 23 topics was 
addressed. The interview topics ranged from attitudes toward situations of uncertainty, to interpretation of 
newspaper statistics, to the design of studies for collecting desired statistics and to formal probability problems. 
In most interviews, a computational modeling environment designed for experimenting with P&S was made 
available to the learners. 

2.1 Probability Distributions 

The interview format was designed to focus on several key concepts in P&S. The notion of probability
distribution is one such key concept. We focus on probability distribution partly because of its importance: 1)
an understanding of probability distribution is crucial to understanding the statistical models ubiquitous in
scientific research; 2) the concept is equally central to participation in the public forum as an informed citizen;
3) without the concept of distribution, learners cannot truly understand how events can be both unpredictable
and constrained - we cannot have a coherent concept of randomness (Wilensky, 1993; 1995b); and 4) probability
distributions stand at the interface between the traditional study of probability and the traditional study of
statistics and, thus, afford an opportunity to make strong connections between the two disciplines. Another
reason we chose to focus on probability distribution is the potential for bringing about meaningful change in 
the learning experience for many students. In a typical course in probability and statistics, students are exposed 
to a standard library of distributions and associated formulae, but do not have a chance to construct these 
distributions and understand what "problems they are trying to solve". The availability of new computational 
"object-based parallel modeling languages" (Wilensky, forthcoming) affords learners the ability to construct 
these distributions as patterns emergent from probabilistic rules. Through these constructions, learners can 
make connections between probabilistic descriptions of discrete phenomena and the statistical descriptions of 
the ensemble, in the process seeing the utility of both descriptions. Meaningful intervention is possible.

An analysis of the interview data revealed persistent confusions on the part of most interviewees about
probability distributions. These confusions and attendant anxieties resulted in the interviewees turning off their 
sense-making with regard to probability distributions and merely invoking rote formulae when applying 
probability distributions to real world problems. 

3.0 Connected Mathematics 

Central themes from the Connected Mathematics research program (Wilensky, 1993; 1995c) inform the
Connected Probability project. The Connected Mathematics program takes broad aim at both our view of the 
discipline of mathematics and the practice of mathematics pedagogy. Connected Mathematics is situated in the 
Constructionist learning paradigm (Papert, 1991; 1993). The name Connected Mathematics comes from two 
seemingly disparate sources, the literature of emergent Artificial Intelligence (AI) and the literature of feminist 
critique. From emergent AI and, in particular, from the Society of Mind theory (Minsky, 1987; Papert, 1980), 
Connected Mathematics takes the idea that a concept cannot be intelligible if it has only one meaning; it is
through connections that concepts gain meaning. The feminist literature (e.g., Belenky et al., 1986; Keller, 
1983; Gilligan, 1977; Surrey, 1991), contributes the idea of "connected knowing", a personal form of knowing 
that is intimate and contextualized as opposed to an alienated and disconnected formal knowing. Mathematical 
concepts derive their meaning and their power through their embeddedness in a personally and socially 
constructed web of connections to other ideas and experiences, both mathematical and non-mathematical. 

In a Connected Mathematics learning environment, the focus is on learner-owned investigative activities 
followed by reflection. Thus, students are not led through the mathematical "litany" of definition, theorem,
proof. Mathematical concepts are not simply given by formal definitions. Instead, mathematical concepts are
multiply represented (e.g., Kaput, 1987; Von Glaserfeld, 1989) and the environment supports multiple styles
and ways of knowing (see e.g., Turkle & Papert, 1991). Mathematical intuitions are not assumed to be static, 
nor are some mathematical concepts assumed to be "abstract" and thus not amenable to intuitive apprehension. 
Learners are supported in building and developing their mathematical intuitions over a lifetime and, through 
this construction process, mathematical objects are seen to be more concrete (Wilensky, 1991; 1993) as 
learning progresses. 

Connected Mathematics views mathematics as something humans make - we build tools to operate upon and
make sense of our world. In contrast to learning procedures or formalisms first, as in the traditional
curriculum, Connected Mathematics calls for making many more connections between mathematics and the
world at large as well as between different mathematical domains throughout the learning experience (e.g.,
Cuoco & Goldenberg, 1995). Empowering technology is central to this effort. Technology is used as a
personally expressive medium - to explore areas of mathematics previously inaccessible, to make abstract
mathematical concepts concrete, and to create new mathematics. In contrast to reform documents such as the
NCTM standards, which portray an "image" of mathematics as essentially a problem solving activity, the
Connected Mathematics vision is a more generative one - the central activity being making new mathematics. A culture
of design and critique is developed. 

Connected Mathematics pays great attention to the affective side of learning and doing mathematics. As such, 
it pays attention to the role of play, joy, wonder and curiosity in the learning of mathematics and to the role of
desire for social connection as a motivation for engaging in mathematical activity (see also Thurston, 1992). 
Difficulties in learning mathematics are also examined from an affective perspective. Connected Mathematics 
looks critically at the role of shame in mathematical culture - how shame creates, first, anxiety (see e.g., 
Chipman, Krantz & Silver, 1994; Steele, 1995; Tobias, 1993) and, ultimately, negative mathematical 
self-image (Dweck & Leggett, 1988; Steele, 1995) in the individual learner and how it stifles discourse and 
enforces hierarchy (O'Connor, 1993) in the mathematical community. In contrast, a Connected Mathematics 
learning environment fosters an atmosphere in which it is safe for mathematical learners to express their partial 
understandings and values these understandings regardless of their degree of correspondence with the 
mathematics canon. In doing so, it parts company with the literature on misconceptions which highlights the
gulf between expert and novice. Connected Mathematics, instead, stresses the continuity between expert and
novice understanding. This continuity obtains both in their explicit conceptions - novice conceptions are not
discarded on the road to expertise but rather repurposed - and in their general "messy" character - even for
expert mathematicians, only small areas of clarity have been laboriously carved out from the generally messy
terrain (see also Smith, diSessa & Roschelle, 1994).

4.0 Sources of Epistemological Anxiety and the Making of Mathematical Culture 

As discussed above, learners of probability suffer from an attendant anxiety about the source of legitimacy of 
the probabilistic procedures and formulae they are learning. This anxiety is not unique to P&S. Indeed, it is 
quite common in mathematics instruction. 

The sources of this anxiety are multiple. The culture of the mathematics classroom contributes greatly (Hoyles, 
1985; Lampert, 1991; Noss, 1988; O'Connor, 1993). Its insistence on unitary definitions for mathematical 
concepts reinforces the belief that mathematical knowledge is completely disconnected from the rest of the 
learner's knowledge (Cuoco, Goldenberg & Mark, 1995; Minsky, 1987-b; Wilensky, 1993). The learner, thus
poised at the lip of a yawning chasm separating her previous conceptions from the new knowledge, is 
understandably anxious. The formal definitions sever the links to the many related conceptions that could serve 
as bridging mechanisms and could alleviate anxiety. The anxiety is further enhanced by social isolation. The 
insistence on answers or results, absent of intellectual texture, and invalidation of personal voice (Confrey, in 
press; Gilligan, 1977) discourages the learner from expressing her partial and incomplete understanding. This, 
in turn, can create the sense that everyone else is clear about the meaning of these objects and that you alone 
are confused. The further knowledge that any admission of confusion will be used to rank you in the 
mathematics hierarchy and judge your mathematical intelligence makes it practically taboo to share your 
"messy" concepts (Papert, 1972; 1993; Wilensky, 1993) and attendant epistemological anxiety. 

Connected Mathematics provides several sources of therapy for epistemological anxiety. Instead of formal 
definitions, it emphasizes connections to the learner's knowledge that make the transition to new knowledge 
both safer and more meaningful. By exposing the universality of confusion and messy concepts, it reassures the 
learner that her predicament is "normal" and shared. This, in turn, leads to the voicing of previously unvoiced 
concepts and sets the context for the social negotiation of mathematical knowing. This emphasis on making, 
voicing and sharing takes public the activity of understanding mathematics. These changes in mathematical 
culture are both supported and made much more realizable by the advent of computational modeling 
environments. 

4.1 Epistemological Anxiety in the Realm of Probability 

There are several reasons why epistemological anxiety is particularly pronounced in the domain of P&S: 

1) We are living in a time when the meanings of the basic notions of probability theory, the ideas of 
"randomness", "distribution", and even "probability" itself are still quite controversial. There is, as yet, no 
consensual agreement on the foundational concepts of probability - they are still being debated by
mathematicians and philosophers (see, e.g., Cartwright, 1987; Gigerenzer, 1987; Savage, 1954; Suppes, 1984;
Von Mises, 1957). There is particular disagreement on the applicability of probability to individual instances.
This leads to confusion about the applicability of probability theory to the unique situations in our lives. It 
raises doubts about its utility in making individual choices which might serve to animate our interest in the 
subject. 

2) When we encounter statistical data (as we commonly do in newspaper articles), it is often detached from its 
method of collection. If a statistic is left vague and disconnected in this way, we cannot operate on it,
transform it and compare it with others so as to make it meaningful and concrete.

3) Short term memory limitations make it difficult to collect large amounts of data in "real time". So instead,
we construct summaries of the data. These summaries are not analyzable into their constituent elements and 
thus necessarily remain disconnected from the original data. In this way P&S is at a stage of its development 
not unlike other areas of mathematics throughout history. These same kinds of memory constraints once made 
arithmetic very difficult until memory saving technologies (such as zero, place value and generalized decimal 
notation) were invented (See e.g., Kline, 1972; McLuhan, 1964; Rotman, 1987). A central conjecture of the 
Connected Probability project is that computational technology should do for probability what Hindu-Arabic
notation has done for arithmetic. Computational environments allow large bodies of data to be visualized at 
once in a small space and large numbers of repetitions to happen in a short time. As a result, short term 
memory resource limitations can be overcome. The focus can then be on probabilistic reasoning. 

4) It is very hard to get feedback as to the adequacy of our probability judgments. If we assess that the 
probability is 30% that it will rain tomorrow and it does rain, what have we learned? We require many, many 
such observations in order to calibrate our judgments to the data. Again, memory limitations block us from 
receiving the necessary feedback. Moreover, there is little opportunity for control because the world is
constantly changing and we cannot repeat the experiments so as to get systematic and controlled feedback.

5) Seeing our lives as experiments in probability requires seeing our current situation as one of a large collection
of similar possible situations. To make this large collection into a "concrete" object of thought (Wilensky,
1991) requires a massive construction job. It requires forging links between our current situations and
situations forgotten in the past, situations not yet arisen, and counterfactual situations (those situations which
could have arisen but did not).

6) A powerful way to think about probability situations is to think of multiple, interacting, distributed entities. 
Again we cannot keep so many entities in working memory (Case, 1993; Miller, 1956) at once. Consequently, 
we construct wholes out of the many interacting parts and understand the behavior of the ensemble in terms of 
its average behavior. 
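
Point 4 above can be made concrete with a few lines of simulation. The Python sketch below is an illustration only, not a tool used in the project; it assumes a hypothetical world in which the true chance of rain really is 30% every day. A single day yields a frequency of 0 or 1 and so teaches almost nothing, while the frequency observed over many repeated days settles near the stated probability.

```python
import random


def observed_rain_frequency(true_chance, days, seed=0):
    """Fraction of simulated days on which it rained."""
    rng = random.Random(seed)
    rainy_days = sum(1 for _ in range(days) if rng.random() < true_chance)
    return rainy_days / days


if __name__ == "__main__":
    # One observation says little; many repetitions calibrate the judgment.
    for days in (1, 10, 100, 10_000):
        print(days, observed_rain_frequency(0.30, days))
```

In the world outside the computer, these repetitions are exactly what is unavailable: the conditions change and the experiment cannot be rerun.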

Memory and resource limitations, absent a notation or medium, do contribute to why, until now, many people 
have not developed robust probabilistic intuitions. We have reached a time where these limitations will no 
longer hold sway in P&S. The emergence of computational modeling environments promises to provide a 
holding environment, a symbolizing medium which can express data distributed over space and time. This
should enable us to view dynamic properties of ensembles and to conduct experiments with immediate
feedback. It is for this reason that one of the most powerful forms of therapy for epistemological anxiety is
access to a computational environment in which the objects whose epistemological status is in doubt can be
modeled, experimented with and debugged to the satisfaction of the learner. The computer serves as a setting 
and symbolizing medium for building mathematical intuitions. 

5.0 Computational Modeling in Mathematics and Science Education 

In a computational modeling approach to mathematics and science education, a modeling language and sets of 
associated tools are made available to learners. Learners then choose a concept or phenomenon and create a 
computational model of the phenomenon. In contrast to pre-built simulations, where the learner is interacting 
with an expert model, the model-building approach allows the learner to own and pursue personally meaningful 
investigations. In a model building approach, there are no "black boxes" in the phenomenon of interest - the 
learner constructs her own "boxes". 

The modeling approach has both cognitive and affective benefits. By building computational models of 
everyday and scientific phenomena, learners can develop robust mental models of the underlying probability 
and statistics. The feedback provided by building and then testing the computational model supports the learner 
in debugging and successively refining (Leron, 1994) the mental model. On the affective side, the modeling
environment supports the learner's expression of partial understanding. In a modeling environment, it is clear
that partial understandings are the norm - one builds models by, initially, expressing a tentative model in the
modeling language, then gradually refining and debugging the model. In this way modeling activities
short-circuit the "whole cloth" answer orientation of the culture of classroom mathematics (Sfard & Leron, in
preparation). Indeed, the activity of model building is so different from normal mathematics for most students
that they often do not see themselves as "doing mathematics" ({Feurzeig, 19xx; Mandinach & Cline,
1994; Thornton, 19xx}). Embedded in a suitable social environment, the opportunity to express partial
understandings without judgment can go a long way towards alleviating a sense of mathematical shame. 
Relieved from the necessity of "getting it right" the first time, even learners not usually considered good at 
mathematics and science can build models that demonstrate a qualitatively greater level of mathematical 
achievement than is usually found in mathematics classrooms. Computational modeling serves as a powerful 
therapy for epistemological anxiety. 

5.1 Modeling Emergent Phenomena 

One productive domain for computer modeling is "emergent phenomena", in which global patterns emerge 
from local interactions. Emergent phenomena can provide rich contexts for learners to build with probabilistic 
parts. Learners can explore the stable structures that emerge when probabilistic behavior is given to distributed 
computational agents. Thus, instead of encountering probability through "solving" sets of decontextualized 
combinatoric problems, learners can participate in constructionist activities - they design and build with 
probability. 

Probability distributions can be seen as canonical cases of emergent phenomena. They are stable global 
structures that arise from the interactions of multiple distributed agents. Typically, in statistics classes, the 
emergent nature of distributions such as the normal distribution (or "bell curve") is hidden. We "learn" about 
distributions through descriptions of their global characteristics (e.g., mean, standard deviation, variance, skew, 
moments). This conceals the way these distributions are built up or emerge from individual instances. As a 
result, the critical connection between the micro-level of the phenomenon and the macro-level is severed. The
learner is given the macro-description and formulae and is asked to accept on authority that these
macro-descriptions are appropriate for such and such a set of phenomena. This supplanting of experimentation with
external authority contributes significantly to epistemological anxiety. 
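
The emergence described here can be sketched outside any particular modeling language. The Python fragment below is an illustration written for this discussion, not a StarLogo model from the project: each of several thousand hypothetical agents obeys one micro-level rule (step left or right with equal chance), and the bell shape appears only in the macro-level tally of where the agents end up. The function names, agent count and step count are all assumptions.

```python
import math
import random
import statistics
from collections import Counter


def run_agents(num_agents=5000, steps=100, seed=0):
    """Each agent repeatedly applies one micro-level rule: step -1 or +1 at random."""
    rng = random.Random(seed)
    positions = [0] * num_agents
    for _ in range(steps):
        for i in range(num_agents):
            positions[i] += rng.choice((-1, 1))
    return positions


def text_histogram(positions, bin_width=4, scale=20):
    """Print a rough histogram of final positions, one row per bin."""
    bins = Counter((p // bin_width) * bin_width for p in positions)
    for left_edge in sorted(bins):
        print(f"{left_edge:4d} {'#' * (bins[left_edge] // scale)}")


if __name__ == "__main__":
    steps = 100
    final_positions = run_agents(steps=steps)
    text_histogram(final_positions)
    # Macro-level summaries, now traceable to the micro-level rule.
    print("mean:", statistics.mean(final_positions))
    print("spread:", statistics.pstdev(final_positions), "vs sqrt(steps):", math.sqrt(steps))
```

No agent "knows" about the mean or the standard deviation, yet the mean of the final positions should sit near zero and the spread near the square root of the number of steps. The summary statistics become descriptions of something the learner has built rather than quantities accepted on authority.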

5.2 Object-based Parallel Modeling 

The modeling environment used in the Connected Probability project is the language StarLogo (Resnick,
1992; 1994; Wilensky, 1993; 1995b), extended to be especially useful for modeling phenomena in P&S.
StarLogo is an extension of the computer language Logo. In Logo, a graphical turtle is controlled by issuing
movement commands. In StarLogo, however, the user can control thousands of "turtles" or "agents". StarLogo
is an example of a new kind of modeling language - it is an object-based parallel modeling language. Other
examples of such languages include KidSim (Smith, Cypher & Spohrer, 1994) and AgentSheets (Repenning, 
1993). "Parallel" means that the agents behave as if they are all executing their commands simultaneously. 
StarLogo was originally implemented on the Connection Machine, a massively parallel supercomputer, and 
was thus in fact parallel, each agent controlled by a separate processor. In more recent implementations, 
StarLogo runs on serial machines such as the Macintosh computer, so it is running as a parallel "virtual 
machine" on top of a serial architecture. 

"Object-based" means that each agent is self-contained: it has its own internal state and communicates with 
other agents primarily by local channels - agents don't do much action at a distance. The computer language 
Logo had a single such object - the "turtle". Papert (1980) argued that the power of the turtle to facilitate 
learning geometry lay in the fact that the child could identify with the turtle - enabling "syntonic" learning. 
Object-based parallel modeling languages such as StarLogo afford greater identification with their objects, and
thus, in contrast to more procedural languages, foster syntonic learning of emergent phenomena. 

Learners can assign micro-level rules to simultaneously control thousands of such turtles. Macro-phenomena
emerge in relation to these micro-rules. This makes both forwards and backwards modeling possible - that is,
1) exploring the global effects of various sets of local rules, and 2) trying to produce a known global 
phenomenon through finding appropriate local rules (Wilensky, 1995a). 

Such object-based parallel modeling languages can be powerful environments for doing P&S. One way to think 
of these agents is as multi-sided irregular dice. At each step in running a model built in one of these
languages, thousands of dice are thrown and the results visually displayed. This use of object-based parallel
modeling languages is reminiscent of the technique of Monte Carlo simulations and shares features with
resampling statistics approaches (Diaconis & Bradley 1983; Konold, 1994; Simon et al, 1976; Simon & Bruce, 
1991). Unlike these other approaches, in object-based parallel modeling languages, control of the program is 
distributed through the agents rather than centrally controlled and the agents themselves are inspectable, 
visualizable, "concrete" objects. 
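
The "agents as irregular dice" picture can be sketched in Python. This is a loose analogy rather than StarLogo code: the class name `DieAgent`, the face weights and the number of agents are all hypothetical, and the ordinary loop below is a serial stand-in for what a parallel modeling language presents as simultaneous action. What the sketch does preserve is that each die is a self-contained object whose state can be inspected after every throw.

```python
import random
from collections import Counter


class DieAgent:
    """A self-contained agent: an irregular die that carries its own state."""

    def __init__(self, weights, rng):
        self.weights = list(weights)  # relative chance of each face (assumed values)
        self.rng = rng
        self.face = None              # current face, inspectable at any time

    def throw(self):
        faces = list(range(1, len(self.weights) + 1))
        self.face = self.rng.choices(faces, weights=self.weights)[0]


def throw_all(agents):
    """One step of the model: every die is thrown, then the faces are tallied."""
    for agent in agents:
        agent.throw()
    return Counter(agent.face for agent in agents)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical loaded die: face 6 is five times as likely as any other face.
    dice = [DieAgent((1, 1, 1, 1, 1, 5), rng) for _ in range(5000)]
    tally = throw_all(dice)
    for face in sorted(tally):
        print(face, tally[face])
    print("state of one agent:", dice[0].face)
```

A conventional Monte Carlo program would typically keep only the tally. Here the individual dice persist, so a learner can move between the ensemble pattern and any single agent that contributed to it.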

Even at this basic level of "dice rolling", many experiments in probability and statistics are suggested. Indeed, 
most of the standard elementary curriculum in P&S can be easily modeled in StarLogo using these 
"dice-rolling" features of the language alone. In a non-computational setting, experiments with large data sets 
are perforce constrained to be summarized by mediating global formulae and statistics. Experiments can be
done for small data sets and small numbers of trials and are typically confined to several coin-flips, dice-rolls and
spinner-spins. But once the scale is increased, the ability to experiment is gone. As discussed earlier, the
critical connection between the micro-level of the phenomenon and the macro-level is severed.

Additionally, in object-based parallel modeling languages, the "dice" can also interact with each other. As a 
result, many complex interactions are possible - the results of which are not predictable using standard 
mathematical models. The ability to experiment with such complex interactions and get visual feedback,
however, allows learners to make qualitative sense of different possible patterns, classifying meta-patterns and
noting trends such as feedback cycles, critical densities, clustering, etc. Even though these agents - thought of 
as dice - are probabilistic components whose individual states cannot be predicted, the emergent properties of 
the collection of agents can be stable and predictable. This insight is essential to grasping the key notion of 
probability distribution. For this reason, object-based parallel modeling languages are excellent constructionist 
environments for learning probability - the learner is constructing objects using probabilistic components. This 
situation is analogous to the acquisition of print literacy, as it includes both reading and writing. Instead of just 
"reading" about probability (by being given the summary formulae), the learner "writes" with probability. 

6.0 Case Studies 

The remainder of this paper will be devoted to two case studies. The cases, taken from Connected Probability 
interviews, are offered as illustrations of epistemological anxiety concerning the notion of normal distribution 
and, in one case, a successful therapy through StarLogo modeling in a Connected Mathematics context. 

6.1 Normal Anxiety - Lynn's tricky distributions 

Let me introduce Lynn, a psychologist in her mid-thirties who at the time of her interview had recently received 
her Ph.D. For her dissertation, Lynn had done a statistical comparison study of different treatment modalities 
for improving the reading comprehension of dyslexic children. While in graduate school, she had taken an 
introductory probability and statistics class as well as two classes in statistical methods. As such she would 
"naturally" be classified as having a fairly sophisticated background. In this interview fragment we will see that 
despite her sophisticated background, basic ideas of randomness and distribution are alienated for her - neither 
trusted nor appropriated for her ends. Even though Lynn was capable of applying formal statistical methods in 
her coursework, she was fundamentally confused about the "madness behind the method". In this interview, 
she starts to negotiate meaning for "normal distribution". While her attempts take her to a position vis a vis 
distributions which would be considered just plain wrong in most university courses and may indeed be 
incoherent, she is for the first time grappling with the meaning of the fundamental ideas of randomness and 
distribution. In so doing, she takes a step towards developing intuitions about and appropriating these concepts. 

As background to the interview, I asked Lynn about her attitudes towards mathematics. Her reply was 
indicative of the issues that would arise in the body of the interview: 

U: So, what was math like for you in school? 

L: Well, I was always good at math. But, I didn't really like it. 

U: Why was that? 

L: Why? I don't know. I guess I always felt like I was getting away with something, you know, like I was 
cheating. I could do the problems and I did well on the tests, but I didn't really know what was going on. 

One of the first questions that arose in the body of the interview with Lynn was: What would you estimate is 
the probability that a woman in the US would be at least 5'5" tall? Here is the text of the ensuing conversation: 

L: Well I guess it would be about 1/2. 

U: Why do you say that? 

L: Well height is normally distributed and I'd say the mean height of women is about 5' 5" so half of the women 
would be taller than 5'5". 

U: Why would half the women be taller than the mean? 

L: Because the curve of the normal distribution is symmetric around the mean - so half would be below it and 
half above it. 

U: What about the number that are exactly 5'5" tall? 

L: Well I guess they could make a difference, but no - they shouldn't matter because they're just one data point 
and so can be ignored. 

U: You can ignore any one data point? 

L: Yes, .... because there are infinitely many data points so one doesn't make a difference. 

U: But can you make infinitely many height discriminations? How do you measure height? 

Here, I'm just trying to probe Lynn's thinking about discrete vs. continuous distributions. 

L: Well.... I guess we measure it in inches - so there probably aren't infinitely many data points. I'm somewhat 
confused because I know height is distributed normally and I know that for normal distributions the probability 
is 0.5 of being bigger than the mean, but how come you can ignore the bump in the middle? I guess 0.5 is just 
an estimate, it's approximately 0.5. 

U: How do you know height is distributed normally? 

L: I don't remember where I first learned it, but lots of the problems in the stats books tell you that. 

My question was intended to elicit an explanation of why Lynn believed height was distributed normally in 
terms of her personal knowledge, but Lynn responded as though it were probing the context in which she gained 
that knowledge and the authority it derives from. 

U: Why do you think height is distributed normally? 



L: Come again? (sarcastic) 

U: Why is it that women's height can be graphed using a normal curve? 



L: That's a strange question. 
U: Strange? 

L: No one's ever asked me that before (thinking to herself for a while) I guess there are 2 possible theories: 

Either it's just a fact about the world, some guy collected a lot of height data and noticed that it fell into a 
normal shape 



U: Or? 

L: Or maybe it's just a mathematical trick. 
U: A trick? How could it be a trick? 

L: Well... Maybe some mathematician somewhere just concocted this crazy function, you know, and decided to 
say that height fit it. 



U: You mean... 



L: You know the height data could probably be graphed with lots of different functions and the normal curve 
was just applied to it by this one guy and now everybody has to use his function. 

U: So you're saying that in the one case, it's a fact about the world that height is distributed in a certain way, 
and in the other case, it's a fact about our descriptions but not about height? 

L: Yeah... 

U: Well, if you had to commit to one of these theories, which would it be? 
L: If I had to choose just one? 
U: Yeah. 

L: I don't know. That's really interesting. Which theory do I really believe? I guess I've always been uncertain 
which to believe and it's been there in the background you know, but I don't know. I guess if I had to choose, if 
I have to choose one, I believe it's a mathematical trick, a mathematician's game. .... What possible reason 
could there be for height, ....for nature, to follow some weird bizarro function? 

The above was a short section of the first probability interview I conducted. Until the last exchange 
transcribed, I had the feeling that the interview was not very interesting for Lynn. But after that last exchange 
she got very animated and involved in the interview. This question of the reality vs. "trickiness" of the normal 
and other distributions occupied her for much of the next few discussions. She came back to it again and again. 
At the time, I was a bit surprised by this reaction. Earlier in the interview, when I asked her about infinitely 
many height discriminations, I think I was trying to steer Lynn into making a distinction between a discrete and 
a continuous distribution. This, I thought, might have resolved her dilemma about ignoring the "bump in the 
middle". But I was not expecting to hear such basic doubts about the validity of statistics from someone with so 
much experience studying and using statistical methods. 
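
The discrete versus continuous distinction behind the "bump in the middle" can be made concrete with a small calculation. The following is a minimal sketch in Python and was not part of the interview; the mean of 65 inches follows Lynn's own estimate, while the standard deviation of 2.5 inches and the rounding of heights to the nearest inch are assumptions chosen purely for illustration.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

MEAN_IN = 65.0  # 5'5", Lynn's estimate of the mean height
SD_IN = 2.5     # assumed standard deviation, for illustration only

# Continuous model: a single exact height has probability zero,
# so "taller than the mean" is exactly one half.
p_continuous = 1.0 - normal_cdf(MEAN_IN, MEAN_IN, SD_IN)

# Heights recorded to the nearest inch: "65" covers true heights in [64.5, 65.5).
p_bump = normal_cdf(65.5, MEAN_IN, SD_IN) - normal_cdf(64.5, MEAN_IN, SD_IN)
p_recorded_at_least = 1.0 - normal_cdf(64.5, MEAN_IN, SD_IN)

print(f"continuous  P(height > 65)    = {p_continuous:.3f}")
print(f"discrete    P(recorded == 65) = {p_bump:.3f}")
print(f"discrete    P(recorded >= 65) = {p_recorded_at_least:.3f}")
```

Under these assumptions the recorded value "exactly 5'5"" carries a noticeable share of the probability, and the chance of being recorded at 5'5" or taller equals one half plus half of that bump. In the continuous idealization the bump shrinks to zero, which is why it can be ignored there but not in the measured data.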

Lynn's question was, as she says, "in the background" all the time when she thought about statistics. How could 
it not be? At the same time that she was solving probability and statistics problems for homework, and later on 
when she actually implemented a study which used statistical tests to evaluate treatments, she was never really 
sure that the formulae she was using had any explanatory power, any real meaning other than mathematical 
conventions. Is it any wonder, then, that she was engaged by this question? Up until now, no one had engaged 
her in discussion about how to interpret the mathematical formulae she was using. She was left feeling that her 
own work played by the "official rules" but in a way was not accessible to her - and she harbored doubts about 
its validity. Her questions, while essentially epistemological, go to the heart of the mathematics of probability. 
Note that, in emphasizing the normal distribution, her courses had led her to see the process of describing 
height by a curve as a single process with a single agent (possibly the mathematician) controlling it. The idea of 
the height distribution as being emergent from many interacting probabilistic factors was not in awareness. 

Had we tested Lynn on textbook statistics problems, she would have performed flawlessly. Yet, even in an 
interviewee of such sophisticated background, epistemological anxiety was lurking in the background. Lynn 
knew that she did not really know what gave validity to the procedures she had so laboriously mastered. 
Through engaging in this interview, her fundamental confusion is exposed. To an uninformed observer, it 
might seem that the interview has caused Lynn to regress, confusing her about material she has mastered. But, 
in fact, by expressing her partial understandings of distributions through asking whether normal distributions 
are "just a mathematician's trick", Lynn takes a big step towards a more Connected Mathematical 
understanding of distributions. 

In the next interview, we will see how another interviewee, engaged in building a StarLogo model of normal 
distributions, deals with these same issues. 

6.2 Modeling Normal Behavior - Alan's Hopping Rabbits 

Alan, a graduate student in media studies, had a strong college mathematics background. Although he is 
mathematically quite sophisticated, Alan also expresses confusion about normal distributions: 

A: I never really understood normal distributions. I mean I can do the problems, I've even helped Wendy [his 
wife] with her statistics problems, but what is really going on? Why should height fall into a normal 
distribution? 

As the interview progressed, Alan became intrigued by the question: "why is height normally distributed?" 
After some prodding by the interviewer to come up with an answer to his question, Alan made a conjecture as 
to why height is normally distributed: 

A: You start out with this one couple, Adam and Eve say, and they're a certain height. No, make this simpler, 
we just have Adam and he's a certain height. Now let's suppose we have parthenogenesis, (is that the word?) 
and Adam has kids on his own. And suppose his kids are a bit off from where he is [that is, slightly different 
heights] due to copying errors or something. Then they sort of form a little bell curve - anyway a binomial 
distribution. Now suppose they have children and that process continues many generations, then wouldn't you 
wind up with a normally distributed population? 

To explore this conceptual model, Alan decided to write a StarLogo model with me. The model is initialized to 
have thousands of "rabbits" at the bottom middle of the graphics screen. (Call the initial location of the rabbits 
the "origin". For reference, a green vertical line is placed at the origin.) Rabbits can hop right or left from the 
origin on a line at the bottom of the screen. (Call it the hopping line.) Each rabbit has a set of possible step sizes 
which it can hop in one clock "tick". Associated with each step size is a probability - the probability that the 
rabbit will choose that step size. To make the interface easier for himself and other possible users who might 
not be comfortable with probabilities (expressed as real numbers between 0 and 1), the probabilities of rabbit 
steps are expressed as integer relative odds. The rabbits are initialized with the command "setup-rabbits" which 
takes three arguments: the number of rabbits, a list of step-sizes and a list of probability-ratios. So, for 
example, if one wanted to create 1000 rabbits that could hop to the right 2 units with probability 3/4 and hop to 
the left 1 unit with probability 1/4, one would type the command 

setup-rabbits 1000 [2 -1] [3 1]. 

After executing this command, the rabbits would be all piled on top of each other at the middle bottom of the 
screen. It is then possible to perform experiments and let them hop for a while. At each clock cycle each rabbit 
picks a step to hop according to the probabilities and hops that distance on the hopping line. Once the rabbits 
have hopped for one clock tick, they are no longer all in the same place. They are still on the hopping line, but 
at different x positions on that line. Above each position at which there are rabbits present, we display a yellow 
column - the height of which is proportional to the number of rabbits at that location. In this way an evolving 
"living" histogram of the distribution of rabbit locations on the hopping line appears on the screen. 

To make this a bit clearer, let's take a simple example: 

If we initialize the rabbits with the command setup-rabbits 8000 [1 -1] [1 1], then 8000 rabbits will appear at 
the origin. Each rabbit has been initialized so that it can only hop either one unit to the left or one unit to the 
right with equal probability. If we let them hop one step, then because the probability of moving left is the 
same as the probability of moving right, approximately half the rabbits will move one unit left and another half 
will move one unit right. As a result two roughly equally high yellow columns will appear on the screen above 
locations x=1 and x=-1. If we now let the rabbits hop one more step, there will be a tall yellow column in the 
middle and shorter, roughly equally high, yellow columns to the right and to the left. Continuing this process 
will lead to histograms of binomial distributions symmetric about the origin. (See figure below). 
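
The simple example above can be sketched in a few lines of Python. This is an illustrative analogue, not the StarLogo model Alan built: the names setup_rabbits, hop, histogram and expected_columns are hypothetical, and the dictionary used to hold the "world" is an assumption of this sketch. The integer odds are passed directly as weights, mirroring the relative-odds interface described earlier, and the simulated columns are printed next to the expected binomial columns.

```python
import random
from collections import Counter
from math import comb

def setup_rabbits(n_rabbits, step_sizes, odds):
    """Hypothetical analogue of setup-rabbits: every rabbit starts at the origin."""
    if len(step_sizes) != len(odds):
        raise ValueError("need one odds entry per step size")
    return {"positions": [0] * n_rabbits,
            "step_sizes": list(step_sizes),
            "odds": list(odds)}

def hop(world, rng):
    """One clock tick: each rabbit independently picks a step by its odds and moves."""
    steps = rng.choices(world["step_sizes"], weights=world["odds"],
                        k=len(world["positions"]))
    world["positions"] = [p + s for p, s in zip(world["positions"], steps)]

def histogram(world):
    """Column heights: the number of rabbits at each occupied x position."""
    return dict(sorted(Counter(world["positions"]).items()))

def expected_columns(n_rabbits, n_hops):
    """Expected column heights for the [1 -1] [1 1] case (a binomial distribution)."""
    return {2 * k - n_hops: n_rabbits * comb(n_hops, k) / 2 ** n_hops
            for k in range(n_hops + 1)}

rng = random.Random(0)  # fixed seed so the run is repeatable
world = setup_rabbits(8000, [1, -1], [1, 1])
for tick in (1, 2):
    hop(world, rng)
    print(tick, "simulated:", histogram(world))
    print(tick, "expected: ", expected_columns(8000, tick))
```

After one tick the expected columns are 4000 rabbits at each of x=-1 and x=1; after two ticks they are 2000, 4000 and 2000 at x=-2, 0 and 2. The simulated columns should fall close to, but not exactly on, these values.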




In the above simple example, when the rabbits hop, their average location does not move from the origin - the 
histogram of rabbits is symmetric about the origin. To help visualize the movement of the rabbits when the 
average location does change, a purple column is displayed above the average location and a green column 
"remembers" where the rabbits started. Natural questions that arise in this rabbit jumping "microworld", then, 
are: what will cause the average location of the rabbits to change and what will cause more of them to be on one 
side of the average than on the other? 

As you may recall, the reason Alan and I created this model was to investigate Alan's theory of height 
distributions. One way to think of the rabbits when they're all at the origin is as a population of individuals all 
of a standard height. When rabbits take a step they represent new individuals in the population who have 
deviated slightly from their parents' height. In our simple example the average height of the population does not 
change and the population heights are symmetric about this average. 

When viewing this example, Alan exclaimed: "I see why population heights are symmetrical, the rabbits hop 
the same distance to the right as to the left". He then typed in the command setup-rabbits 8000 [2 -1] [1 1]. 
"Now", he said, "the graph won't be symmetrical, it will skew to the right". But to Alan's surprise, when the 
rabbits hopped, the histogram remained symmetrical, but the purple line representing the average rabbit 
location (or average population height) moved to the right. 

Alan quickly understood the "bug" in his thinking: 

"I see. Rabbits hop farther to the right when they hop but they're still just as likely to hop right as left. So, the 
distribution will remain symmetric, it will just move to the right one unit on average for each step. To make the 
distribution asymmetric, you need to change the probabilities not the step sizes. " 

(Figure: the rabbit histogram at setup and after hops.) 
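
Alan's debugging of step sizes versus probabilities can be checked numerically. The sketch below is an assumption-laden Python illustration rather than the StarLogo model: it computes the exact distribution of positions by repeated convolution instead of simulating rabbits, and the function names and the choice of 20 hops are hypothetical. Skewness is used here as one possible measure of asymmetry.

```python
from collections import defaultdict

def distribution_after(n_hops, step_sizes, odds):
    """Exact probability of each position after n_hops, built one hop at a time."""
    total = sum(odds)
    probs = [o / total for o in odds]
    dist = {0: 1.0}  # all rabbits start at the origin
    for _ in range(n_hops):
        nxt = defaultdict(float)
        for pos, p in dist.items():
            for step, q in zip(step_sizes, probs):
                nxt[pos + step] += p * q
        dist = dict(nxt)
    return dist

def mean_and_skewness(dist):
    """Mean position and standardized third moment (0 for a symmetric shape)."""
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    third = sum((x - mean) ** 3 * p for x, p in dist.items())
    return mean, third / var ** 1.5

# Unequal step sizes, equal odds: setup-rabbits 8000 [2 -1] [1 1]
print(mean_and_skewness(distribution_after(20, [2, -1], [1, 1])))

# Equal step sizes, unequal odds: the analogue of [1 -1] [3 1]
print(mean_and_skewness(distribution_after(20, [1, -1], [3, 1])))
```

In this sketch the first setting yields a skewness of essentially zero while the mean drifts to the right by (2 - 1)/2 = 0.5 units per hop, and the second setting yields a nonzero skewness. This agrees with Alan's conclusion that asymmetry comes from the probabilities rather than from the step sizes.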

From the above experiment, Alan concluded: 

If children are just as likely to be shorter as they are taller than their parents but when they are taller their 
difference from their parents is larger, then the population as a whole will slowly get taller. But since we know 
that height of the population as a whole is symmetrically distributed, children must be equally likely to be 
taller as they are to be shorter than their parents. I see why normal distributions are so common. Whenever we 
make a measurement, we're just as likely to make a mistake in one direction as the other, so the resultant 
distribution of measurements will be normal. Its average will be the true value of the measurement and the 
spread of the graph will depend on the accuracy of the measurement. 
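
Alan's remark about measurement can be explored with the same kind of hopping process. The following Python sketch is an illustration under stated assumptions, not something Alan wrote: each measurement is modeled as a true value plus 25 small errors, each equally likely to push up or down, and the true value of 100, the error sizes and the function name measure are all hypothetical.

```python
import random
import statistics

def measure(true_value, n_errors, error_size, rng):
    """One measurement: the true value plus many small, equally likely up/down errors."""
    return true_value + sum(rng.choice((-error_size, error_size))
                            for _ in range(n_errors))

rng = random.Random(7)  # fixed seed so the run is repeatable
TRUE_VALUE = 100.0      # hypothetical quantity being measured

for error_size in (0.1, 0.5):  # a more accurate and a less accurate instrument
    readings = [measure(TRUE_VALUE, 25, error_size, rng) for _ in range(5000)]
    print(error_size,
          round(statistics.mean(readings), 2),
          round(statistics.stdev(readings), 2))
```

In this model the average of the readings stays near the true value for both instruments, while the spread grows with the size of the individual errors (the theoretical standard deviation is the error size times the square root of 25). The resulting histogram is approximately, rather than exactly, bell-shaped.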

Alan then went on to investigate what would happen to the distributions if he gave the rabbits a much larger 
"palette" of step sizes. This investigation led him to more insights about how different micro-level rules 
produce different probability distributions. When I asked Alan to sum up what he had learned from his 
modeling experience, he expressed satisfaction that he had gained a deeper understanding of binomial 
distributions and "what kind of animal a distribution is". But he also said that perhaps he had not yet "entirely 
understood normal distributions". He speculated that the normal distribution is "the limit of the binomial 
distribution as n approaches infinity - when you fill in the gaps between the bars of the binomial". He 
worried, however, that he did not yet sufficiently understand the connection between the binomial distribution 
and the formula he remembered for normal distributions: "how does the integral of e^(-x^2/2) come out of this?" 
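
The connection Alan was reaching for is the classical de Moivre-Laplace limit. It is stated here as standard background, not as something derived in the interview, for the simplest rabbits that hop one unit left or right with equal odds; the symbols S_n and X_i are notation introduced for this sketch.

```latex
% Position after n hops of +1 or -1 with equal odds (the [1 -1] [1 1] rabbits)
S_n = X_1 + X_2 + \cdots + X_n, \qquad P(X_i = +1) = P(X_i = -1) = \tfrac{1}{2}

% Exact (binomial) column heights, for k = 0, 1, \dots, n
P(S_n = 2k - n) = \binom{n}{k} \left( \tfrac{1}{2} \right)^{n}

% de Moivre-Laplace limit: the rescaled position approaches the standard normal
\lim_{n \to \infty} P\!\left( \frac{S_n}{\sqrt{n}} \le x \right)
  = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^{2}/2} \, dt
```

The division by the square root of n is what "fills in the gaps between the bars": as n grows, the columns of the rescaled histogram crowd together and their outline approaches the bell curve whose integral appears on the right.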

It is important to remember that Alan began the interview with a command of elementary probability and 
statistics. He was facile with the mathematics of binomial processes and knew the standard parameters and 
statistics of normal distributions. It was the connection between these two areas of his knowledge that was 
missing for Alan. School mathematics, like pre-nineteenth century mathematics, keeps these two disciplines 
separate. Indeed, with limited computational power, it is hard for a school curriculum to make a meaningful 
connection between the two. Object-based parallel modeling languages bring the two back together. By 
enabling Alan to give probabilistic rules to the individuals and see the resulting pattern of the ensemble, the 
StarLogo modeling language enabled him to make these connections. Though Alan has made significant 
progress in his understanding of distributions, he is not done. His desire to further connect his new 
understanding to the formula he learned in school exemplifies his appropriation of the Connected Mathematics 
approach. 

7.0 Discussion 

It is clear that the source of Lynn and Alan's difficulty with P&S did not lie in a lack of technical proficiency. 
By most standards, their understanding was advanced. The issue is that their understanding was not well 
connected. The case studies demonstrate their ability to make these connections and develop rich intuitions in 
areas which, in the past, they accommodated by mechanically applying formulae. This supports the Connected 
Probability project's stance of eschewing the accommodationist view which says "that's the way it is" for P&S, 
and, instead, viewing P&S as fitting within the larger framework of research in understanding mathematics. 
Development of intuition and changes in understanding can happen in ways consistent with a broad 
developmentalist stance. In this discussion, I emphasize the features of the Connected Mathematics therapy that 
facilitated the development of effective intuitions about normal distributions. 

Both Lynn and Alan had strong mathematical backgrounds, had done graduate coursework in mathematics and 
had excelled. Both of them were very comfortable doing the formal manipulations that constitute the major 
component of traditional instruction in P&S. Nonetheless, both explicitly expressed considerable discomfort 
concerning the notion of normal distribution - the most salient example of the core statistical notion of 
probability distribution. Regarding the nature of normal distributions, they suffered from epistemological 
anxiety. The therapy provided in these learning interviews consisted of two main interventions: 1) Validation 
of the importance of Lynn and Alan's mathematical voice, their personal view of the mathematics and its 
relation to their own experience. This validation was achieved by explicit avowal at the start of the interview, 
by open-ended interviews that followed the interviewee's own thinking and by long drawn out interviews that 
declared of themselves that the interviewee's conceptions were valuable and valued. 2) The use of an 
object-based parallel modeling language, StarLogo, in which to build computational models of probabilistic 
phenomena and successively refine them. These supports provided a context in which to work through their 
epistemological anxiety. 

Though, clearly, neither Lynn nor Alan was an expert in P&S, their sophisticated mathematics background as 
well as their formal mastery of P&S techniques would qualify them as experts in the eyes of many educators. 
These "near-experts" were no less confused about normal distributions than were the novice interviewees. 
Indeed, instructors of P&S with whom I have shared Lynn's interview have been "appalled at the incoherence 
of her remarks", yet instructors like these awarded top grades to Lynn in her P&S courses. Interviews such as 
these cast serious doubt on theories which assert a strong disjunction between novice and expert understanding. 
Even experts must laboriously map out clarity from a very messy "terrain". By overemphasizing formal 
operational thought, typical mathematics education allows intellectual accommodation to be an adequate 
endpoint. In Connected Mathematics, understanding is taken to mean "multiply connected" and not cleanly and 
uniquely specified. This embraces "messy" conceptions at all levels of expertise in the process of learning. One 
has never "arrived". 

Rather than emphasizing problem solving, Connected Mathematics emphasizes problem posing. In problem 
solving, the questions are often disembodied and not the learner's own. Within Connected Mathematics, 
modeling is both a medium in which learners can formulate their questions in a precise way and a method for 
building their understanding. 



Once Alan has posed the problem of the symmetry of the distribution, his reasoning is expressed and refined in 
interaction with the model he has built. Alan strives to find the connections between his representation of the 
individual rabbits' behaviors and the global pattern that they form. He debugs his model by varying each of the 
three inputs to the run-rabbits procedure, focusing particularly on the lists of steps and ratios. How will 
variations in the lengths of the lists and the relative magnitudes of the steps and ratios affect this emergent 
pattern? He alternates his focus from the micro-level, the behavior of the individual rabbits, to the macro-level 
of the distribution. To better keep track of the change in the global pattern over time, he takes advantage of 
StarLogo's open-ended programmability and builds representational aids such as the "origin line", the 
"hopping line" and color-coded classes of rabbits. These representations and tools allow him to debug his 
original conception that the relative size of the steps is the key to an asymmetric distribution and, instead, to 
see the relative size of the probabilities as the key factor. By building distributions with probabilistic parts and 
making them work, Alan makes probability ratios as concrete for himself as step-sizes, and sees how 
distributions are built out of these concrete building blocks. Extending, enhancing and debugging the 
computational model itself is essential to the activity of modeling. This distinguishes it from model 
consumption in which learners are left to tweak the inputs of an expert's model. 

In the Connected Probability project, probability distributions are viewed as a kind of emergent phenomenon. 
Rather than manipulate the tokens of formalism, learners are invited to negotiate the interaction of the micro- 
and macro- levels of probabilistic and statistical phenomena. Through connecting these levels of emergent 
phenomena, learners understand the "mechanisms" of probability distributions - how they are assembled from 
their constituent elements. When these interactions are understood, the global patterns become more than just 
descriptions to be memorized, they are tools to be used. 

Binomial probability distributions are elementary examples of emergent phenomena - emerging from the 
behavior of non-interacting individuals. Other probability distributions emerge from more interactive 
individual behavior. Many natural phenomena can be modeled as systems whose global behavior is not 
predictable using non-computational mathematics. Such phenomena as the shape of a snowflake, the dynamics 
of co-extensive wolf and sheep populations, the economy of a country - all of these can be modeled as 
emerging from the interactions of their constituent individuals. Learning to see emergent phenomena - that is 
learning to describe the phenomena as arising from the interactions of distributed parts - is, in itself, a worthy 
goal for educators. From this perspective, P&S becomes not an end, but an entry point to the world of complex 
systems. 

In the Connected Probability project, the role of technology is as a medium of personal expression and 
articulation of ideas. Empowering literacy implies being able to say something of your own, rather than just 
manipulate the texts of others. While models are ubiquitous in the current learning literature, modeling is not. 
Modeling calls on learners to say something on their own, to be authors of mathematics. This implies the best 
use of technology is for learners to be model builders not just model consumers. The StarLogo modeling 
language has specific features which allow for this articulation: 

Modeling at the level of objects: StarLogo allowed Alan to create, visualize and modify individual rabbits. 
Alan was, thus, able to experiment with individual rabbits whose hopping behavior he could track, control, 
and model with his own body. These features of the language enabled Alan to leverage his knowledge of individual 
behavior into knowledge of the ensemble - to connect the micro- with the macro-. 

Parallelism: The ability to control thousands of rabbits at once gave Alan the experimental apparatus he 
needed to conduct his investigation in "real time". The ability to experiment with collections of rabbits, to 
enable sub-populations of rabbits to have different behaviors was crucial to his construction of the distribution. 

General purpose programmability: The ability to create his own representations (as opposed to manipulating 
the parameters of a pre-conceived model) gave Alan the freedom to construct normal distributions in a way that 
arose from his own nascent understanding of them. It also enabled him to create computationally active tools, 
such as the "hopping line". These refinements of his computational model provided greater support for his 
experimentation and, in turn, led to a refinement of his mental model. 

In a Connected Mathematics learning environment, learners are supported in actively connecting areas of their 
knowledge that have, hitherto, remained separate. Computational platforms, in their ability to bring into being 
new forms of representation and symbolization, can be powerful tools for making these connections. 
Object-based parallel modeling provides a new way of coming to understand and articulate understandings of 
probability and statistics and, in so doing, engage and relieve epistemological anxiety. 